Abstract

A new method for calculation of goodness of multidimensional fits in particle physics experiments 
is proposed. This method finds the smallest and largest clusters of nearest neighbors for observed 
data points. The cluster size is used to estimate the goodness-of-fit and the cluster location 
provides clues about possible problems with data modeling. The performance of the new method
is evaluated in Monte Carlo studies. The new method is applied to estimate the goodness-of-fit
in a $B \to K\ell\ell$ analysis at BaBar.
1. INTRODUCTION



Fits are broadly used in the analysis of particle physics experiments. If sufficient statistics
are accumulated, one usually plots the observed data as a histogram and overlays an expected
histogram or modeling function. The goodness-of-fit is then estimated by taking the deviation
of the observed number of events in each bin from the expected number of events
in the same bin, summing the squares of these deviations to form a $\chi^2$ value, and hence
computing a significance level. This procedure assumes that the bin contents are normally
distributed, which is true only asymptotically in the large-statistics limit. In low-statistics
experiments, one typically observes zero or just a few events per bin, and this procedure
does not produce a reliable result. One must then use other methods. Unbinned maximum
likelihood methods, discussed below, have recently been used in such situations in BaBar
and elsewhere. We also discuss the Kolmogorov-Smirnov test, another well-known method.

In this note, we concentrate on the most general problem: how to test a distribution 
in question against every reasonable alternative hypothesis. In other words, the null and 
alternative hypotheses are stated as follows: 
$H_0$: the observed data obey the expected distribution.

$H_1$: the observed data obey some other unknown but plausible distribution.

The goodness-of-fit, $1 - \alpha$, is defined as the confidence level of the null hypothesis, and $\alpha$
is therefore the Type I error. We remind the reader that the Type II error is traditionally
defined as the probability of accepting the null hypothesis if the alternative hypothesis is
true.

This definition of the problem conforms with the standard $\chi^2$ test of binned data. Indeed,
the $\chi^2$ test computes the discrepancy between expected and observed probability density
functions (pdfs) without imposing constraints on the alternative hypothesis. We do not
discuss examples where the alternative hypothesis $H_1$ can be stated in a more specific form,
e.g., testing normality versus uniformity. Our goal is to propose a new generic procedure
applicable to unbinned fits.

It is not possible to design a versatile procedure applicable to all problems. For example,
we can always choose the alternative distribution to be a set of $\delta$-functions positioned
precisely at the observed experimental points. In this case, the null hypothesis is inferior to
the alternative hypothesis and the null hypothesis is rejected. This simply reflects the fact
that the Type II error is undefined for the generic test stated in the previous paragraph. We
would like to keep our procedure as generic as possible. Yet if more information about the
alternative hypothesis is available, it should be possible to design a more powerful test for
this specific alternative.

We note that the standard binned $\chi^2$ test computes an average deviation of the observed
data from the expected density. However, in many experiments it is useful to focus on
the maximal deviation instead of the average one. Consider, for example, fitting a one-dimensional
histogram divided into 20 bins in the range $[-10, 10]$ to the sum of a standard
normal pdf $N(0, 1)$ with zero mean and unit variance and a uniform pdf, as shown in Fig. 1.
The normal pdf represents signal (for example, the mass of a certain resonance) and the uniform
pdf represents background, with the magnitude of each component fixed to 100 entries. The
$\chi^2$ deviation, computed as $\sum_{\rm bins} (N_{\rm expected} - N_{\rm observed})^2 / N_{\rm expected}$, is 19.34 per 20 degrees of
freedom for each of the three fits, which results in a goodness-of-fit value of 50%. Hence, the
procedure treats all fits as those of equal quality. In reality, of course, the experimenter will
treat each fit in a different way. The top fit will likely be considered "good". The middle
fit will likely raise concern about a large background fluctuation in one bin. The bottom fit
will likely make the experimenter suspect that the signal is not well modeled by the standard
normal pdf with an area of 100. In fact, the experimenter is not really concerned about
the $\chi^2$ deviation averaged over all bins. The more interesting question is: which bins
give the largest $\chi^2$ deviations from expected values, and how probable are these deviations?
The method proposed in this note is designed to answer both these questions for unbinned
fits.
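The 50% figure quoted above can be checked with a few lines of code. The following is a minimal sketch, not part of the original analysis: the function name `chi2_sf_even_dof` is ours, and it relies on the closed-form upper-tail probability of the $\chi^2$ distribution that holds for an even number of degrees of freedom.

```python
import math

def chi2_sf_even_dof(chi2: float, dof: int) -> float:
    """Upper-tail probability P(X >= chi2) for a chi-square variable X.

    Uses the closed form exp(-x/2) * sum_{i<dof/2} (x/2)^i / i!,
    which is valid only for a positive even number of degrees of freedom.
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form holds for positive even dof only")
    half = chi2 / 2.0
    term = 1.0    # i = 0 term of the series
    total = 1.0
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# chi2 = 19.34 with 20 degrees of freedom, as in the three fits of Fig. 1
goodness_of_fit = chi2_sf_even_dof(19.34, 20)
print(f"goodness-of-fit = {goodness_of_fit:.3f}")
```

The result depends only on the summed $\chi^2$, so any redistribution of the same total deviation among the bins yields the same goodness-of-fit value.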





FIG. 1: Three fits with 50% goodness-of-fit values computed using the standard $\chi^2$ method for binned data. Top plot: the $\chi^2$ deviations are distributed uniformly over the bins; middle plot: the $\chi^2$ deviation is entirely due to one bin at the left edge of the histogram; bottom plot: the $\chi^2$ deviation is produced by the two central bins.



2. MAXIMUM LIKELIHOOD VALUE TEST 

The Maximum Likelihood Value (MLV) test is laid out in the BaBar Statistics Report [1].
For any quantity $x$ that characterizes fit quality, the goodness-of-fit is given by

$$1 - \alpha = 1 - \int_{f(x|H_0) > f(x_{\rm obs}|H_0)} f(x|H_0)\,dx , \qquad (1)$$

where $x_{\rm obs}$ is the value of $x$ observed in the fit to the data, and $f(x|H_0)$ is the pdf of quantity
$x$ under the null hypothesis. For the MLV test, the quantity of interest is the likelihood, and
so $\mathcal{L}$ replaces $x$ in the equation above.

By construction, the MLV test can only be used to discriminate against a specific class
of alternative hypotheses. Data are fitted to the density $f(x|\theta)$ and an estimate of the
parameter, $\theta = \theta_0$, is obtained from the fit. Then the null hypothesis $H_0: \theta = \theta_0$ is tested
against the alternative hypothesis $H_1: \theta \neq \theta_0$. Note that the overall validity of the density
$f(x|\theta)$ is never questioned. If the data are drawn from a drastically different pdf, this test
can produce a meaningless result.

Consider, for instance, fitting a one-dimensional random sample to a standard normal
pdf $N(0, 1)$. In reality, however, the data are drawn from a sum of two narrow normal pdfs
placed two units apart: $N(-1, 0.01)$ and $N(+1, 0.01)$, as shown in Fig. 2. Distributions of
likelihood values computed under the null hypothesis for events drawn from the standard
normal pdf $N(0, 1)$ and events drawn from the sum of two normal pdfs are shown in Fig. 2.
Likelihood values computed under the null hypothesis $N(0, 1)$ for the sum of two narrow
normal pdfs are always consistent with the null hypothesis. The procedure does not have
any discriminative power and the obtained fit always produces a reasonable goodness-of-fit
value, even though the null hypothesis is clearly wrong. In this example an experimenter
can easily find the problem by visual comparison of the distributions, but in the real world
of multidimensional distributions such a comparison would be harder to make.
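The example above can be reproduced with a small toy MC. The sketch below is an illustration under stated assumptions rather than the code used for this note: the two narrow normal pdfs are mixed with equal weights, the second parameter of $N(\cdot,\cdot)$ is taken to be the variance, the number of toys is reduced, and the goodness-of-fit is approximated by a one-sided tail fraction of $-2\log\mathcal{L}$ instead of the density-ordered integral of Eq. (1).

```python
import math
import random

N_EVENTS = 10    # events per toy experiment, as in the text
N_TOYS = 2000    # assumption: fewer toys than in the text, to keep the sketch fast

def minus_two_log_l(sample):
    """-2 log L of a sample under the null hypothesis N(0, 1)."""
    return len(sample) * math.log(2.0 * math.pi) + sum(x * x for x in sample)

def draw_null(rng):
    """One toy experiment drawn from the standard normal pdf."""
    return [rng.gauss(0.0, 1.0) for _ in range(N_EVENTS)]

def draw_two_peaks(rng):
    """One toy experiment drawn from N(-1, 0.01) + N(+1, 0.01).

    Assumptions: equal weights for the two peaks; 0.01 is the variance,
    so the standard deviation is 0.1.
    """
    return [rng.gauss(rng.choice((-1.0, 1.0)), 0.1) for _ in range(N_EVENTS)]

rng = random.Random(12345)
null_values = [minus_two_log_l(draw_null(rng)) for _ in range(N_TOYS)]
alt_values = [minus_two_log_l(draw_two_peaks(rng)) for _ in range(N_TOYS)]

def tail_fraction(value):
    """Fraction of null toys with -2 log L at least as large as `value`."""
    return sum(v >= value for v in null_values) / len(null_values)

mean_null = sum(null_values) / N_TOYS
mean_alt = sum(alt_values) / N_TOYS
typical_gof = tail_fraction(mean_alt)
print(f"mean -2logL: null = {mean_null:.2f}, two peaks = {mean_alt:.2f}")
print(f"tail fraction for a typical two-peak sample = {typical_gof:.2f}")
```

Because every event from the two-peak density sits near $|x| = 1$, its contribution to $-2\log\mathcal{L}$ is close to the average contribution of an event from $N(0, 1)$, which is why the likelihood values fall in the bulk of the null distribution.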



FIG. 2: Densities for a standard normal pdf $N(0, 1)$ (solid line) and a sum of two narrow normal
pdfs $N(-1, 0.01)$ and $N(+1, 0.01)$ (dashed line) are shown on the left. $-2\log\mathcal{L}_0$ distributions
computed under the null hypothesis $N(0, 1)$ for both pdfs are shown on the right for 10,000 toy
MC experiments with 10 events in each experiment.



Why did the MLV test fail to reject the null hypothesis for the random sample described
in the previous paragraph? Because the alternative hypothesis $H_1$ was not "every other
plausible distribution" but "another normal distribution". The price for this assumption
was a futile test. It is true that the procedure would also discriminate against certain
non-normal distributions. But it would work by accident, not by design.

Another good example is a test of uniformity. Maximum likelihood methods are useless
here because under uniformity the likelihood value is constant, no matter how the points are
distributed.$^1$

3. GENERIC TESTS 

3.1. Outline 

In the previous section, we established that the likelihood method does not address the
problem stated in the Introduction. A more versatile approach is to test the null hypothesis
without making specific assumptions about its alternative. We refer to such tests as
"generic".

In Section 3.2, we briefly discuss information about generic tests that can be found in
the statistics literature. Then we proceed with a discussion of the Kolmogorov-Smirnov test,
a well-known generic approach, in Section 3.4, and propose a new method in Section 3.6.
We emphasize that any generic test can be standardized by transforming the density of
interest to uniform and performing a test of uniformity. The transformation to a uniform
density is described in Section 3.3, and the definition of uniformity is discussed in Section 3.5.
The transformation to a uniform density is not required for the Kolmogorov-Smirnov test
but is essential for the proposed method. Subtleties related to the non-uniqueness of the
uniformity transformation are discussed in Section 3.7.

3.2. Statistics Literature 

There is a large body of statistics literature on goodness-of-fit tests. Unfortunately, much
of this literature is of little use to us for one of the following reasons:

• The discussed problem is too specific, e.g., testing one specific type of pdf against
another specific type of pdf.

• Asymptotic approximations, e.g., the central limit theorem, are used.

• Authors concentrate on designing an analytic tool (inevitably based on some approximation)
and dismiss MC simulation.

We, on the other hand, would like to have a generic approach for unbinned fits with small
numbers of events. We can rely on MC generators; hence, analyticity of the solution is not
an issue.

A well-known method that complies with these requirements is the Kolmogorov-Smirnov
test. However, the Kolmogorov-Smirnov test lacks sensitivity for a broad class of alternative
hypotheses.

To our knowledge, the distance-to-nearest-neighbor method proposed in Section 3.6 has
not been described anywhere in the literature. As the idea seems obvious, it is quite possible
that we are simply reinventing the wheel. But we hope that this wheel is worth reinventing.



$^1$ Unless there is an experimental point observed outside the range of definition of the uniform pdf.

3.3. Transformation to Uniform Density

Before we proceed with a discussion of various methods, we note that the problem can be
standardized by transforming any $n$-dimensional pdf in question to uniform. This transformation
offers a number of advantages:

• All problems are described by the same formalism. 

• The practical task of generating toy MC experiments is easily solved with a uniform 
random number generator. 

• It is easy to implement this transformation numerically in the absence of an analytic 
model for the pdf. 

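Any $n$-dimensional random vector $(x^{(1)}, x^{(2)}, \ldots, x^{(n)})$ with a joint pdf $f(x^{(1)}, x^{(2)}, \ldots, x^{(n)})$
can be transformed to a vector $(u^{(1)}, u^{(2)}, \ldots, u^{(n)})$ uniformly distributed on an $n$-dimensional
unit cube $0 < u^{(i)} < 1$; $i = 1, 2, \ldots, n$. This transformation is given by

$$
\begin{aligned}
u^{(1)} &= \int_{-\infty}^{x^{(1)}} f_1(t, x^{(2)}, x^{(3)}, \ldots, x^{(n)})\,dt \Big/ f_2(x^{(2)}, x^{(3)}, \ldots, x^{(n)}) \\
u^{(2)} &= \int_{-\infty}^{x^{(2)}} f_2(t, x^{(3)}, x^{(4)}, \ldots, x^{(n)})\,dt \Big/ f_3(x^{(3)}, x^{(4)}, \ldots, x^{(n)}) \\
&\;\;\vdots \\
u^{(n-1)} &= \int_{-\infty}^{x^{(n-1)}} f_{n-1}(t, x^{(n)})\,dt \Big/ f_n(x^{(n)}) \\
u^{(n)} &= \int_{-\infty}^{x^{(n)}} f_n(t)\,dt
\end{aligned}
\qquad (2)
$$

where $f_1 \equiv f$ is the joint pdf and

$$
\begin{aligned}
f_2(x^{(2)}, x^{(3)}, \ldots, x^{(n)}) &= \int_{-\infty}^{+\infty} f_1(x^{(1)}, x^{(2)}, \ldots, x^{(n)})\,dx^{(1)} \\
f_3(x^{(3)}, x^{(4)}, \ldots, x^{(n)}) &= \int_{-\infty}^{+\infty} f_2(x^{(2)}, x^{(3)}, \ldots, x^{(n)})\,dx^{(2)} \\
&\;\;\vdots
\end{aligned}
\qquad (3)
$$

This transformation is one-to-one for a strictly positive pdf $f(x)$.

The cumulative distribution function (cdf) for an $n$-dimensional uniform distribution is simply

$$F(u) = \prod_{i=1}^{n} u^{(i)} . \qquad (4)$$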
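For $n = 2$, transformation (2) reduces to the conditional cdf of the first variable given the second, followed by the marginal cdf of the second variable. The sketch below illustrates this for a standard bivariate normal pdf with correlation `rho`; the choice of pdf, the value of `rho`, and the function names are assumptions made for illustration only.

```python
import math
import random

def normal_cdf(z):
    """Cdf of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def to_uniform(x1, x2, rho):
    """Map (x1, x2) from a standard bivariate normal with correlation rho
    to the unit square, following Eq. (2) for n = 2.

    u2 is the marginal cdf of x2; u1 is the cdf of x1 conditional on x2,
    which is normal with mean rho * x2 and variance 1 - rho**2.
    """
    u2 = normal_cdf(x2)
    u1 = normal_cdf((x1 - rho * x2) / math.sqrt(1.0 - rho * rho))
    return u1, u2

rng = random.Random(2024)
rho = 0.8  # hypothetical correlation, chosen for illustration
points = []
for _ in range(5000):
    g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    x2 = g2
    x1 = rho * g2 + math.sqrt(1.0 - rho * rho) * g1  # x1 is correlated with x2
    points.append(to_uniform(x1, x2, rho))

mean_u1 = sum(u1 for u1, _ in points) / len(points)
mean_u2 = sum(u2 for _, u2 in points) / len(points)
cov = sum((u1 - mean_u1) * (u2 - mean_u2) for u1, u2 in points) / len(points)
corr = cov * 12.0  # a uniform variable on (0, 1) has variance 1/12
print(f"mean u1 = {mean_u1:.3f}, mean u2 = {mean_u2:.3f}, correlation = {corr:.3f}")
```

Although the input variables are strongly correlated, the transformed coordinates should have means near 1/2 and a correlation compatible with zero, as expected for a uniform density on the unit square.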

3.4. Kolmogorov-Smirnov Test 

A generic method broadly known to physicists is the Kolmogorov-Smirnov test. The
Kolmogorov-Smirnov statistic for a random sample $x_1, x_2, \ldots, x_N$ of $n$-dimensional vectors
$x = (x^{(1)}, \ldots, x^{(n)})$ with a cdf $F(x)$ is given by [2, 3]

$$K_N(F) = \sup_{x \in V_n} |F(x) - F_{\rm obs}(x)| , \qquad (5)$$

where $V_n$ is an $n$-dimensional domain for the cdf $F(x)$, and $F_{\rm obs}(x)$ is the experimentally
observed cdf. The null hypothesis is accepted if $K_N(F_0)$ is small and rejected if $K_N(F_0)$ is
large, where $F_0$ is the cdf for the null hypothesis.
Because the Kolmogorov-Smirnov test compares cumulative distributions, it lacks sensitivity
to fluctuations within small clusters. Consider, for example, two sets of points on a unit
interval $0 \le x \le 1$:

Set 1: $x_1 = 1/4$, $x_2 = 1/2$, $x_3 = 3/4$, $x_4 = 1$
Set 2: $x_1 = x_2 = 1/4$, $x_3 = x_4 = 3/4$

Which one of these sets looks more uniform? The Kolmogorov-Smirnov test cannot differentiate
between these two because the statistic (5) is 1/4 under uniformity in both cases.
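The two values of the statistic can be verified directly. The following minimal sketch evaluates the one-dimensional statistic (5) against the uniform cdf $F(x) = x$ and, for comparison, the smallest distance between neighboring points; the function names are ours.

```python
def ks_statistic_uniform(sample):
    """One-dimensional Kolmogorov-Smirnov statistic against F(x) = x on [0, 1].

    The supremum of |F(x) - F_obs(x)| is reached at a data point, either just
    before or just after the step of the empirical cdf.
    """
    xs = sorted(sample)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

def min_neighbor_distance(sample):
    """Smallest distance between two neighboring points of a 1D sample."""
    xs = sorted(sample)
    return min(b - a for a, b in zip(xs, xs[1:]))

set1 = [0.25, 0.5, 0.75, 1.0]
set2 = [0.25, 0.25, 0.75, 0.75]

for name, sample in (("Set 1", set1), ("Set 2", set2)):
    print(name, ks_statistic_uniform(sample), min_neighbor_distance(sample))
```

Both sets give a Kolmogorov-Smirnov statistic of 1/4, whereas the minimal distance between neighbors is 1/4 for Set 1 and 0 for Set 2, so a distance-based statistic does separate them.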

3.5. What Is "Uniform"? 

In fact, the question we asked in the previous section is not so simple. Can we indeed 
make a statement about which set is more likely to be drawn from a uniform distribution? 
The answer is: it depends. 

Suppose we search for a heavily suppressed decay using BaBar data. We plot the mass
distribution and we are convinced that background in our analysis is flat. We would like 
to know if there is an indication of any mass peaks in the plotted data. The data points 
in Set 1 from the previous section are equally spaced while the data points from Set 2 are 
grouped together in two clusters. Hence, Set 1 looks more uniform than Set 2. 

Consider now another example. We have a detector that registers ionizing particles. 
We would like to test the randomness of the particle flux, that is, the exponentiality of 
the distribution of time intervals between consecutive events. However, after an event is 
registered, the detector becomes inactive for a certain period of time. If the expected time 
interval between two consecutive events is much less than the detector's deadtime, the device 
will trigger at fixed time intervals. This would indicate that the process is not exponential 
but periodic. On these grounds, we would conclude that Set 1 looks less uniform than Set 2. 

We obtained two opposite answers to the same question. Of course, the question was 
not the same; in effect, these were two different questions. In the first example, the vaguely 
stated alternative hypothesis was "presence of peaks in the data". In the second example, 
it was "equidistant points on a finite interval". We cannot design a test that gives the right
answer for every possible problem. Nevertheless, it would be good to have a procedure which 
is more sensitive to clustering of data than the Kolmogorov-Smirnov test is. 

3.6. Distance-to-Nearest-Neighbor Test of Uniformity 

The idea of using the distance to nearest neighbor for a test of uniformity is not new [4, 5, 6].
For each data point, $u_i = (u_i^{(1)}, u_i^{(2)}, \ldots, u_i^{(n)})$, in an $n$-dimensional unit cube we find
the nearest neighbor, $u_j$, and compute the distance, $d_{ij} = |u_i - u_j|$. Uniformity is tested
by comparing observed values of $d_{ij}$ with those expected for a uniform distribution. In a
more general approach, one can use an average distance $d_i^{(m)}$ to $m > 1$ nearest neighbors.
In Refs. [4, 5, 6], discussion revolves around using moments of distributions of distances
$d_i^{(m)}$ as test statistics. We propose a test of uniformity based on minimal and maximal
values of the distance $d_i^{(m)}$ to $m$ nearest neighbors. It is intuitively clear that such a test
should be more sensitive to maximal deviations of observed data from the tested pdf than
the Kolmogorov-Smirnov test is.

A similar approach would be to use maximal and minimal volumes of Voronoi regions.
A Voronoi region for a given observed point $u_i$ is defined as a set of points inside the $n$-dimensional
unit cube which are closer to $u_i$ than to any other observed point $u_j$, $j \neq i$.
Voronoi regions have been used by the Sleuth algorithm [7] to search for new physics at the
DØ experiment. In essence, Sleuth computes the probability of observing one data point
in each Voronoi region based on the expectation value for background and marks Voronoi
cells with low probabilities as candidates for a new physics signal. This method addresses
the same question: how consistent are observed data with a null hypothesis, where the null
hypothesis is defined as "background events only". To zeroth order, the volume of a Voronoi
region around $m + 1$ points is proportional to the average size $d^{(m)}$ of the cluster. Therefore,
both methods for goodness-of-fit estimation are expected to produce similar results. This is
confirmed by MC tests described in Section 4. However, using distance to nearest neighbor
is computationally simpler because the construction of Voronoi regions can be avoided.
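The test statistics of this section are straightforward to compute by brute force for the small samples considered here. The sketch below is one possible implementation, not the code used for this note; the function names and the example points are hypothetical, and the Euclidean metric on the unit cube is assumed.

```python
import math

def avg_distance_to_nearest(points, m=1):
    """For each point, the average Euclidean distance to its m nearest neighbors."""
    if not 1 <= m < len(points):
        raise ValueError("m must satisfy 1 <= m < number of points")
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(sum(dists[:m]) / m)
    return result

def dtnn_statistics(points, m=1):
    """Return (minimal distance, maximal distance, index of min, index of max).

    The indices locate the tightest and the most isolated cluster, which is
    the information used to look for problems with the data modeling.
    """
    d = avg_distance_to_nearest(points, m)
    i_min = min(range(len(d)), key=d.__getitem__)
    i_max = max(range(len(d)), key=d.__getitem__)
    return d[i_min], d[i_max], i_min, i_max

# Hypothetical points in the unit square, already transformed to uniformity
example = [(0.1, 0.1), (0.1, 0.2), (0.9, 0.9), (0.5, 0.5)]
d_min, d_max, i_min, i_max = dtnn_statistics(example, m=1)
print(f"min = {d_min:.4f} at point {i_min}, max = {d_max:.4f} at point {i_max}")
```

The brute-force search costs of order $N^2$ distance evaluations per experiment, which is negligible for samples of tens of events; for much larger samples a spatial index would be a natural replacement.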

3.7. Invariance of Test Statistic under Uniformity Transformation 

The transformation to uniformity is not unique, even if we limit the problem to continuous
mappings. A transformation $x \to u$ is continuous if two infinitely close points are mapped
onto two infinitely close points, i.e., $\lim_{|x_i - x_j| \to 0} |u_i - u_j| = 0$. Consider, for example, a
uniform pdf on a unit circle $f(r, \phi) = 1/\pi$; $0 < r < 1$, $0 < \phi < 2\pi$. The joint pdf of random
variables

$$r' = r , \qquad \phi' = \phi + \alpha r ; \qquad 0 < |\alpha| < \infty \qquad (6)$$

is also uniform on the unit circle. However, this transformation does not conserve distance
between two points. Another example is relabeling of variables in transformation (2) for
a non-factorizable pdf in $n > 1$ dimensions.
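A short numerical check makes the first example concrete. In the sketch below, the two test points and the value of `alpha` are arbitrary choices made for illustration; the Jacobian of transformation (6) with respect to $(r, \phi)$ is unity, so uniformity is preserved, but the Cartesian distance between the points changes.

```python
import math

def twist(r, phi, alpha):
    """Transformation (6): rotate each point by an angle proportional to its radius."""
    return r, (phi + alpha * r) % (2.0 * math.pi)

def cartesian_distance(p, q):
    """Euclidean distance between two points given in polar coordinates (r, phi)."""
    (r1, phi1), (r2, phi2) = p, q
    dx = r1 * math.cos(phi1) - r2 * math.cos(phi2)
    dy = r1 * math.sin(phi1) - r2 * math.sin(phi2)
    return math.hypot(dx, dy)

alpha = math.pi            # arbitrary twist strength
a, b = (0.5, 0.0), (1.0, 0.0)   # two points on the same ray

before = cartesian_distance(a, b)
after = cartesian_distance(twist(*a, alpha), twist(*b, alpha))
print(f"distance before = {before:.3f}, after = {after:.3f}")
```

For this choice the two points start 0.5 apart and end up at $(0, 0.5)$ and $(-1, 0)$, so the distance-to-nearest-neighbor statistic computed after the twist differs from the one computed before it.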

It is clear that different transformations to uniformity do not necessarily produce identical
values either for the Kolmogorov-Smirnov statistic or for the distance to nearest neighbor.
Inevitably, the value of goodness-of-fit for a specific set of experimental data depends on the
choice of transformation. We do not consider this circumstance a major obstacle. In many
problems, it is possible to find a reasonable transformation to uniformity that preserves the
natural metric of the experiment.

In many particle physics experiments, observation variables are independent or weakly
correlated. The pdf of interest is therefore factorizable or close to such. In this case, transformation
(2) is reduced to $u^{(i)} = F_i(x^{(i)})$, $i = 1, 2, \ldots, n$, where $F_i$ is the marginal cdf for the $i$th
component. This transformation is the most obvious and natural choice. In other experiments,
the pdf can be transformed to a factorizable one. For example, a two-dimensional
normal pdf can be rotated to align the axes of the normal elliptic contour with the coordinate
axes.

If the pdf is severely non-factorizable, one can split $n$ observation variables into $k$ mutually
independent (or weakly correlated) groups with $n_i$, $i = 1, 2, \ldots, k$, variables in each group,
$n = n_1 + n_2 + \ldots + n_k$. Within each group, variables are strongly correlated and the marginal
$n_i$-dimensional pdf cannot be factorized. To obtain a test statistic invariant under relabeling
of observation variables in transformation (2), one would have to try all $n_i!$ permutations
of variables within each group. For example, the minimal distance to nearest neighbor would
be chosen as the minimum of all distances to nearest neighbor in these $n_i!$ permutations.
This method was proposed for a multidimensional Kolmogorov-Smirnov test. We simply
restate it here in reference to the distance-to-nearest-neighbor approach.



4. TESTS 



We consider four two-dimensional pdfs $f(x, y)$:

• normal pdf $N(\mu_x = 0, \mu_y = 0, \sigma_x^2 = 1, \sigma_y^2 = 1, \rho = 0)$ with zero means, unit variances
and zero correlation between $x$ and $y$

• narrow normal pdf $N(0, 0, 0.25, 0.25, 0)$

• sum of two narrow normal pdfs $N(-1.3, 0, 0.01, 0.01, 0)$ and $N(+1.3, 0, 0.01, 0.01, 0)$

• uniform pdf defined on a square $-5 < x < 5$; $-5 < y < 5$

For each density, we run 10,000 toy MC experiments with 10 events per experiment. We
use the standard normal pdf $N(0, 0, 1, 1, 0)$ as the null hypothesis (except in one example, as
discussed below) and plot in Fig. 3 likelihood values $-2\log\mathcal{L}$ computed under the null
hypothesis for all pdfs. Assuming the null hypothesis, we apply the uniformity transformation
to each MC experiment and plot values of the Kolmogorov-Smirnov statistic for all pdfs in
Fig. 4. We also plot two-dimensional distributions of maximal versus minimal distance to
nearest neighbor in Fig. 5. We use these MC distributions to estimate Type II errors for
hypothesis tests at a given confidence level against each alternative to the null hypothesis. The
confidence levels and errors are shown in Table I. We repeat this exercise treating the uniform
pdf as the null hypothesis and testing it against the standard normal pdf $N(0, 0, 1, 1, 0)$.
This result is also shown in Table I and Fig. 6.

With the definitions of the Type II error and confidence level given in the Introduction,
the smaller the Type II error for a fixed confidence level, the more powerful the test.
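The relation between the confidence level and the Type II error can be made explicit with toy distributions of any test statistic. The sketch below assumes a simple one-sided cut on a single statistic; the function name, the `reject_large` switch, and the integer toy values are hypothetical and serve only to illustrate the bookkeeping.

```python
def type2_error(null_stats, alt_stats, cl, reject_large=True):
    """Fraction of alternative-hypothesis toys accepted by a one-sided cut
    that accepts a fraction `cl` of null-hypothesis toys.

    With reject_large=True the null hypothesis is rejected for large values
    of the statistic (e.g., maximal distance); with reject_large=False it is
    rejected for small values (e.g., minimal distance).
    """
    if not reject_large:
        # Mirror the statistic so that "reject small" becomes "reject large".
        null_stats = [-v for v in null_stats]
        alt_stats = [-v for v in alt_stats]
    ordered = sorted(null_stats)
    n = len(ordered)
    index = min(n - 1, max(0, round(cl * n) - 1))
    cut = ordered[index]
    return sum(v <= cut for v in alt_stats) / len(alt_stats)

# Hypothetical toy values: the alternative is shifted upward by 50 units
null_toys = list(range(100))
alt_toys = [v + 50 for v in null_toys]

err95 = type2_error(null_toys, alt_toys, 0.95)
err50 = type2_error(null_toys, alt_toys, 0.50)
print(f"Type II error: {err95:.2f} at 95% CL, {err50:.2f} at 50% CL")
```

Lowering the confidence level shrinks the acceptance region of the null hypothesis and therefore reduces the Type II error, which is the pattern seen in every row of Table I.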

We compared results obtained by the distance-to-nearest-neighbor method to those ob- 
tained through Voronoi regions and found no significant difference. 

The maximum likelihood method is very efficient for discriminating one normal pdf against
another and against a uniform pdf, which can be considered as a limiting case of a normal
distribution with large variance. As expected, it fails to discriminate against two narrow
normal pdfs because the implicit assumption of overall normality for the alternative hypothesis
does not hold in this case. The distance-to-nearest-neighbor method performs better
than the Kolmogorov-Smirnov approach for every test. This confirms our intuitive assumption
about enhanced sensitivity of the distance-to-nearest-neighbor method to deviations of
data from an expected pdf. We note that the proposed distance-to-nearest-neighbor method
is versatile as it provides some level of discrimination against every alternative hypothesis,
although by no means should it be expected to provide the best discrimination against every
alternative hypothesis.

5. EXAMPLE: EVIDENCE FOR $B \to K^{(*)}\ell^+\ell^-$ AT BaBar

We apply the proposed distance-to-nearest-neighbor method to results of a $B \to K^{(*)}\ell^+\ell^-$
study [8] at BaBar. In this study, eight $B \to K^{(*)}\ell^+\ell^-$ decays were investigated. Signal
rate and upper limit estimates were obtained for these eight decays. We concentrate on
two modes with measured signal yields: $N(B^+ \to K^+ e^+ e^-)$ = 14.4+42 and
$N(B^+ \to K^+ \mu^+ \mu^-)$ = 0.5^3 (statistical errors only). The former can be described as a "significant
measurement" while the latter can be used to set an upper limit.
FIG. 3: $-2\log\mathcal{L}$ under the null hypothesis $N(0, 0, 1, 1, 0)$ for the four pdfs discussed in the text. A histogram for the uniform pdf is not shown because it is far to the right.



TABLE I: Confidence levels (CL) and Type II errors for the maximum likelihood value (MLV),
Kolmogorov-Smirnov (KS), and distance-to-nearest-neighbor (DTNN) tests. DTNN Type II errors
can be reduced for the $N(0, 0, 1, 1, 0)$-vs-$N(0, 0, 0.25, 0.25, 0)$ test by imposing a two-dimensional
linear cut on the distributions shown in Fig. 5. Such a cut was not used here because these values
are for illustration only.

| Test | CL | Type II error: MLV test | Type II error: DTNN test | Type II error: KS test | Comment |
|---|---|---|---|---|---|
| $N(0, 0, 1, 1, 0)$ vs uniform | 95% | 0.0% | 27.2% | 66.2% | cutting on minimal distance for DTNN |
| | 50% | 0.0% | 0.5% | 20.4% | |
| $N(0, 0, 1, 1, 0)$ vs $N(0, 0, 0.25, 0.25, 0)$ | 95% | 0.8% | 55.9% | 93.7% | cutting on maximal distance for DTNN |
| | 50% | 0.0% | 6.7% | 17.1% | |
| $N(0, 0, 1, 1, 0)$ vs two narrow normal pdfs | 95% | 100% | 0.0% | 100% | cutting on maximal distance for DTNN |
| | 50% | 97.1% | 0.0% | 18.4% | |
| uniform vs $N(0, 0, 1, 1, 0)$ | 95% | N/A | 0.6% | 75.6% | cutting on maximal distance for DTNN |
| | 50% | N/A | 0.0% | 0.1% | |
FIG. 4: The Kolmogorov-Smirnov statistic under the null hypothesis $N(0, 0, 1, 1, 0)$ for the four pdfs discussed in the text.



Signal yields in this analysis are obtained using unbinned maximum likelihood fits to
two-dimensional distributions of energy versus mass shown in Fig. 7. The two-dimensional
background is modeled by the pdf

$$f(\Delta E, m_{ES}) = A \cdot \exp(s \Delta E) \cdot m_{ES} \sqrt{1 - \frac{m_{ES}^2}{E_b^2}} \cdot \exp\left[ -\xi \left( 1 - \frac{m_{ES}^2}{E_b^2} \right) \right] , \qquad (7)$$

where $\Delta E = E_B - E_b$ is the difference between the energy of the $B$ candidate and the
beam energy, $E_b = 5.29$ GeV/$c^2$; $m_{ES}$ is the beam-constrained mass of the $B$ candidate; $s$
and $\xi$ are shape parameters; and $A$ is a factor needed for proper normalization of the pdf.
The signal shape is modeled by a normal-like function whose specific analytic expression is
not important for this exercise.

We ask the following question: How consistent are the observed data with the background
pdf? In other words, we compute goodness-of-fit values assuming that all events come from
the background. The background pdf (7) is smooth while a hypothetical signal is expected
to manifest itself through accumulation of events in a small region of the two-dimensional
plot. In this case, the alternative hypothesis can be reasonably stated as "presence of peaks
in the data". Presence of peaks in the data would result in a smaller minimal distance $d^{(m)}$
to $m$ nearest neighbors than the one expected from the smooth background pdf.

To estimate the goodness-of-fit, we transform the background pdf (7) to uniform using
Eq. (2), generate 10,000 MC experiments, and determine the goodness-of-fit as the fraction of
these MC experiments in which the minimal distance to nearest neighbor is less than the
one observed in the data. We conclude that the $B^+ \to K^+ e^+ e^-$ and $B^+ \to K^+ \mu^+ \mu^-$ data
are consistent with the fit at the 51% and 80% level, respectively. At this point, there is no
indication of any peaks in the data.

FIG. 5: Maximal vs minimal distance to nearest neighbor computed under the null hypothesis $N(0, 0, 1, 1, 0)$ for the four pdfs discussed in the text. The histogram for the uniform pdf shows a very narrow peak at the left edge of the plot.

Now we repeat the exercise described in the previous paragraph for $d^{(m)}$, $m > 1$.
Goodness-of-fit values are plotted versus $m$ for both $K\ell\ell$ modes in Fig. 8. For a cluster
of size 12, we estimate that the $B^+ \to K^+ e^+ e^-$ data are consistent with the fit only at the
0.13% level. At the same time, the goodness-of-fit for the $B^+ \to K^+ \mu^+ \mu^-$ data does not
depend dramatically on the cluster size. The lowest goodness-of-fit value of 6.8% for the
$B^+ \to K^+ \mu^+ \mu^-$ data corresponds to the test with clusters of size 8.
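The scan over the number of nearest neighbors can be sketched as follows. This is an illustration under stated assumptions, not the analysis code: the data points are invented (eight scattered points plus a tight cluster of four), they are assumed to be already transformed to the unit square, and only 300 toy experiments are generated instead of 10,000.

```python
import math
import random

def min_avg_distance(points, m):
    """Smallest average distance from a point to its m nearest neighbors."""
    best = float("inf")
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        best = min(best, sum(dists[:m]) / m)
    return best

def goodness_of_fit(points, m, n_toys, rng):
    """Fraction of uniform toy experiments whose minimal distance to m nearest
    neighbors is less than the one observed in `points`."""
    observed = min_avg_distance(points, m)
    n, dim = len(points), len(points[0])
    below = 0
    for _ in range(n_toys):
        toy = [tuple(rng.random() for _ in range(dim)) for _ in range(n)]
        if min_avg_distance(toy, m) < observed:
            below += 1
    return below / n_toys

# Hypothetical data in the unit square: scattered points plus a cluster of four
scattered = [(0.05, 0.90), (0.20, 0.35), (0.30, 0.75), (0.45, 0.10),
             (0.65, 0.85), (0.80, 0.25), (0.90, 0.60), (0.15, 0.60)]
cluster = [(0.50, 0.50), (0.51, 0.50), (0.50, 0.51), (0.51, 0.51)]
data = scattered + cluster

rng = random.Random(7)
gof = {m: goodness_of_fit(data, m, n_toys=300, rng=rng) for m in (1, 2, 3)}
for m, value in gof.items():
    print(f"m = {m}: goodness-of-fit = {100.0 * value:.1f}%")
```

In this invented sample, the goodness-of-fit is expected to be lowest when $m + 1$ matches the size of the cluster, which mirrors the way the scan in Fig. 8 singles out the cluster size that deviates most from the background density.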

We conclude that the $B^+ \to K^+ e^+ e^-$ data are inconsistent with the background density.
Not surprisingly, the data cluster that gives the maximal deviation from the background pdf
consists mostly of points located inside the signal region.

6. SUMMARY 

We have proposed a new method for estimation of goodness-of-fit in multidimensional
analysis using a distance-to-nearest-neighbor test of uniformity. This procedure is recommended
as a more versatile tool than the maximum likelihood methods for a vague generic
alternative hypothesis. However, if the alternative hypothesis is stated in more specific
terms, other methods may be superior.



[Plot panels: $N(0, 0, 1, 1, 0)$ and $N(0, 0, 0.25, 0.25, 0)$]


FIG. 6: Maximal vs minimal distance to nearest neighbor computed under the uniform null hypothesis for the uniform pdf defined on a square $-5 < x < 5$; $-5 < y < 5$ and $N(0, 0, 1, 1, 0)$.




FIG. 7: Difference $\Delta E$ (GeV) between the energy of the reconstructed $B$ candidate and the beam
energy versus beam-constrained mass $m_{ES}$ (GeV/$c^2$) of the reconstructed $B$ candidate. Data for
the $B^+ \to K^+ e^+ e^-$ decay are shown on the left, and data for the $B^+ \to K^+ \mu^+ \mu^-$ decay are shown
on the right. Signal regions are shown with boxes. Data clusters that give maximal deviations
from the expected pdfs are shown with open circles.






FIG. 8: Goodness-of-fit (%) versus number of nearest neighbors (cluster size minus one) included
in the goodness-of-fit calculation for the $B^+ \to K^+ e^+ e^-$ data (left) and $B^+ \to K^+ \mu^+ \mu^-$ data
(right).